Regression on manifolds: Estimation of the exterior derivative
Collinearity and near-collinearity of predictors cause difficulties when
doing regression. In these cases, variable selection becomes untenable because
of mathematical issues concerning the existence and numerical stability of the
regression coefficients, and interpretation of the coefficients is ambiguous
because gradients are not defined. Using a differential geometric
interpretation, in which the regression coefficients are interpreted as
estimates of the exterior derivative of a function, we develop a new method to
do regression in the presence of collinearities. Our regularization scheme can
improve estimation error, and it can be easily modified to include lasso-type
regularization. These estimators also have simple extensions to the "large p,
small n" context.
Comment: Published at http://dx.doi.org/10.1214/10-AOS823 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
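The numerical difficulty that the abstract above attributes to near-collinear predictors can be illustrated with a small sketch. This is not the exterior derivative estimator from the paper: ordinary ridge regression stands in as a generic regularizer, and the synthetic data, the helper names `fit_ols` and `fit_ridge`, and the penalty `lam=1.0` are all assumptions made for illustration.

```python
import numpy as np


def fit_ols(X, y):
    # Least-squares coefficients; individually unstable when columns of X
    # are nearly collinear.
    return np.linalg.lstsq(X, y, rcond=None)[0]


def fit_ridge(X, y, lam):
    # Ridge regression (a stand-in regularizer, not the paper's method):
    # solves (X^T X + lam * I) beta = X^T y.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Synthetic data (assumed for illustration): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)

cond_number = np.linalg.cond(X)        # very large for near-collinear columns
beta_ols = fit_ols(X, y)               # individual coefficients poorly determined
beta_ridge = fit_ridge(X, y, lam=1.0)  # stabilized by the penalty

print("condition number:", cond_number)
print("OLS coefficients:", beta_ols)
print("ridge coefficients:", beta_ridge)
```

In this setup only the sum of the two coefficients is well determined by the data, so the individual least-squares coefficients can swing widely while the regularized fit stays close to the generating values.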
Inverse Optimization with Noisy Data
Inverse optimization refers to the inference of unknown parameters of an
optimization problem based on knowledge of its optimal solutions. This paper
considers inverse optimization in the setting where measurements of the optimal
solutions of a convex optimization problem are corrupted by noise. We first
provide a formulation for inverse optimization and prove it to be NP-hard. In
contrast to existing methods, we show that the parameter estimates produced by
our formulation are statistically consistent. Our approach involves combining a
new duality-based reformulation for bilevel programs with a regularization
scheme that smooths discontinuities in the formulation. Using epi-convergence
theory, we show the regularization parameter can be adjusted to approximate the
original inverse optimization problem to arbitrary accuracy, which we use to
prove our consistency results. Next, we propose two solution algorithms based
on our duality-based formulation. The first is an enumeration algorithm that is
applicable to settings where the dimensionality of the parameter space is
modest, and the second is a semiparametric approach that combines nonparametric
statistics with a modified version of our formulation. These numerical
algorithms are shown to maintain the statistical consistency of the underlying
formulation. Lastly, using both synthetic and real data, we demonstrate that
our approach performs competitively when compared with existing heuristics
- …
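The idea of recovering a parameter from noisy observations of optimal solutions, as described in the inverse optimization abstract above, can be sketched on a toy problem. This is not the paper's duality-based formulation or its enumeration algorithm: the one-dimensional forward problem, the squared-error loss, the grid search, and the names `forward_solution` and `estimate_theta` are assumptions chosen so that the example stays self-contained.

```python
import numpy as np


def forward_solution(theta, u):
    # Toy forward problem (assumed): minimize (x - theta * u)^2 over x in [0, 1].
    # Its minimizer is the projection of theta * u onto [0, 1].
    return np.clip(theta * u, 0.0, 1.0)


def estimate_theta(u, y, grid):
    # Enumerate candidate parameters and keep the one whose predicted optimal
    # solutions best match the noisy observations in mean squared error.
    losses = [np.mean((y - forward_solution(t, u)) ** 2) for t in grid]
    return grid[int(np.argmin(losses))]


# Synthetic noisy measurements of optimal solutions (assumed for illustration).
rng = np.random.default_rng(1)
theta_true = 0.7
u = rng.uniform(0.0, 2.0, size=500)
y = forward_solution(theta_true, u) + 0.05 * rng.normal(size=500)

grid = np.linspace(0.0, 2.0, 201)  # a low-dimensional parameter space
theta_hat = estimate_theta(u, y, grid)
print("estimated theta:", theta_hat)
```

Enumerating a grid is only practical when the parameter space has modest dimension, which matches the setting the abstract gives for its enumeration algorithm.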